NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).


GRID: Protecting Training Graph from Link Stealing Attacks on GNN Models

https://doi.org/10.1109/SP61157.2025.00059

Lou, Jiadong; Yuan, Xu; Zhang, Rui; Yuan, Xingliang; Gong, Neil Zhenqiang; Tzeng, Nian-Feng (May 2025, IEEE)

Free, publicly-accessible full text available May 12, 2026
Towards Robust Vision Transformer via Masked Adaptive Ensemble

https://doi.org/10.1145/3627673.3679750

Lin, Fudong; Lou, Jiadong; Yuan, Xu; Tzeng, Nian-Feng (October 2024, ACM)

Full Text Available
FedClust: Tackling Data Heterogeneity in Federated Learning through Weight-Driven Client Clustering

https://doi.org/10.1145/3673038.3673151

Islam, Md Sirajul; Javaherian, Simin; Xu, Fei; Yuan, Xu; Chen, Li; Tzeng, Nian-Feng (August 2024, ACM)

Full Text Available
A New Routing Strategy to Improve Success Rates of Quantum Computers

https://doi.org/10.1145/3649476.3658790

Qi, Fang; Fu, Xin; Yuan, Xu; Tzeng, Nian-Feng; Peng, Lu (June 2024, ACM)

In the current noisy intermediate-scale quantum (NISQ) era, quantum computing faces significant challenges due to noise, which severely restricts the execution of complex algorithms. Superconducting quantum chips, one of the pioneering quantum computing technologies, introduce additional noise when moving qubits to adjacent locations so that designated two-qubit gates can operate on them. Current compilers rely on decision models that either count the swap gates or multiply the gate errors when choosing swap paths at the routing stage. Our research reveals overlooked cases in which errors propagate through the circuit and accumulate, which may affect the final output. In this paper, we propose Error Propagation-Aware Routing (EPAR), designed to enhance compilation performance by considering accumulated errors in routing. EPAR’s effectiveness is validated through benchmarks on a 27-qubit machine and two simulated systems with different topologies. The results indicate an average success rate improvement of 10% on both real and simulated heavy-hex lattice topologies, along with a 16% improvement in a mesh topology simulation. These findings underscore EPAR’s potential to substantially advance quantum computing in the NISQ era.
Full Text Available
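
The EPAR abstract above says current compilers choose swap paths by either counting swap gates or multiplying gate errors. The following is a minimal sketch of those two baseline decision models only, not of EPAR itself, whose cost function is not given in this record. It reads "multiply the gate errors" as multiplying per-gate success probabilities, which is an assumption, and the error rates and names (`short_noisy`, `long_clean`) are hypothetical.

```python
from math import prod


def swap_count(path_errors):
    """Baseline 1: cost is the number of SWAP gates on the path."""
    return len(path_errors)


def success_probability(path_errors):
    """Baseline 2: product of per-SWAP success probabilities (1 - error)."""
    return prod(1.0 - e for e in path_errors)


# Hypothetical per-SWAP error rates for two candidate paths.
short_noisy = [0.08, 0.09]        # two SWAPs over noisy links
long_clean = [0.01, 0.01, 0.01]   # three SWAPs over cleaner links

candidates = [short_noisy, long_clean]
by_count = min(candidates, key=swap_count)           # prefers fewer SWAPs
by_error = max(candidates, key=success_probability)  # prefers higher success
```

With these made-up numbers the two models disagree: counting picks the shorter path, while the product picks the longer, cleaner one. Neither function tracks error already accumulated on the qubits by earlier gates, which is the gap the abstract says EPAR addresses.
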
Hadar: Heterogeneity-Aware Optimization-Based Online Scheduling for Deep Learning Cluster

https://doi.org/10.1109/IPDPS57955.2024.00066

Sultana, Abeda; Xu, Fei; Yuan, Xu; Chen, Li; Tzeng, Nian-Feng (May 2024, IEEE)

With the wide adoption of deep neural network (DNN) models for various applications, enterprises and cloud providers have built deep learning clusters and increasingly deployed specialized accelerators, such as GPUs and TPUs, for DNN training jobs. Existing schedulers fall short in arbitrating cluster resources among multi-user jobs: they either lack fine-grained heterogeneity awareness or are hard to generalize to various scheduling policies. To fill this gap, we propose a novel design of a task-level heterogeneity-aware scheduler, Hadar, based on an online optimization framework that can express other scheduling algorithms. Hadar leverages the performance traits of DNN jobs on a heterogeneous cluster, characterizes the task-level performance heterogeneity in the optimization problem, and makes scheduling decisions across both spatial and temporal dimensions. The primal-dual framework is employed, with our design of a dual subroutine, to solve the optimization problem and guide the scheduling design. Extensive trace-driven simulations with representative DNN models demonstrate that Hadar improves the average job completion time (JCT) by 3× over an Apache YARN-based resource manager used in production. Moreover, Hadar outperforms Gavel [1], the state-of-the-art heterogeneity-aware scheduler, by 2.5× in average JCT, shortens the queuing delay by 13%, and improves finish-time fairness (FTF) by 1.5%.
Full Text Available
FedClust: Optimizing Federated Learning on Non-IID Data Through Weight-Driven Client Clustering

https://doi.org/10.1109/IPDPSW63119.2024.00200

Islam, Md Sirajul; Javaherian, Simin; Xu, Fei; Yuan, Xu; Chen, Li; Tzeng, Nian-Feng (May 2024, IEEE)

Full Text Available
SEAFL: Enhancing Efficiency in Semi-Asynchronous Federated Learning Through Adaptive Aggregation and Selective Training

https://doi.org/10.1109/IPDPS64566.2025.00052

Islam, Md Sirajul; Panta, Sanjeev; Xu, Fei; Yuan, Xu; Chen, Li; Tzeng, Nian-Feng (June 2025, IEEE International Parallel & Distributed Processing Symposium (IPDPS 2025))

Federated Learning (FL) is a promising distributed machine learning framework that allows collaborative learning of a global model across decentralized devices without uploading their local data. However, in real-world FL scenarios, the conventional synchronous FL mechanism suffers from inefficient training caused by slow devices, commonly known as stragglers, especially in heterogeneous communication environments. Although asynchronous FL effectively tackles the efficiency challenge, it induces substantial system overhead and model degradation. Striking a balance between the two, semi-asynchronous FL has gained increasing attention, but it still suffers from the open challenge of stale models: newly arrived updates are computed from outdated weights, which can easily hurt the convergence of the global model. In this paper, we present SEAFL, a novel FL framework designed to mitigate both the straggler and the stale model challenges in semi-asynchronous FL. SEAFL dynamically assigns weights to uploaded models during aggregation based on their staleness and importance to the current global model. We theoretically analyze the convergence rate of SEAFL and further enhance the training efficiency with an extended variant that allows partial training on slower devices, enabling them to contribute to global aggregation while reducing excessive waiting times. We evaluate the effectiveness of SEAFL through extensive experiments on three benchmark datasets. The experimental results demonstrate that SEAFL outperforms its closest counterpart by up to ∼22% in terms of the wall-clock training time required to achieve target accuracy.
Free, publicly-accessible full text available June 3, 2026
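
The SEAFL abstract above describes weighting uploaded models during aggregation by their staleness. The following is a minimal sketch of staleness-discounted weighted averaging in general, not SEAFL's actual rule, which this record does not give. The polynomial decay, the `alpha` parameter, and the function names are assumptions, and the importance term the abstract mentions is omitted.

```python
def staleness_weight(staleness, alpha=0.5):
    """Assumed polynomial decay: fresher updates get larger weights."""
    return (1.0 + staleness) ** (-alpha)


def aggregate(updates, current_round, alpha=0.5):
    """Average client models, discounting each one by its staleness.

    updates: list of (weights, base_round) pairs, where weights is a flat
    list of floats and base_round is the global round the client started
    training from (hypothetical bookkeeping for this sketch).
    """
    raw = [staleness_weight(current_round - base, alpha) for _, base in updates]
    total = sum(raw)
    coeffs = [r / total for r in raw]  # normalize so coefficients sum to 1
    dim = len(updates[0][0])
    return [
        sum(c * w[i] for c, (w, _) in zip(coeffs, updates))
        for i in range(dim)
    ]


# A fresh update (staleness 0) and a stale one (staleness 4).
merged = aggregate([([1.0, 1.0], 10), ([3.0, 3.0], 6)], current_round=10)
```

Here the merged model lands closer to the fresh update than a plain mean would, and with equal staleness the function reduces to an ordinary average.
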
Quantum Vulnerability Analysis to Guide Robust Quantum Computing System Design

https://doi.org/10.1109/TQE.2023.3343625

Qi, Fang; Smith, Kaitlin N; LeCompte, Travis; Tzeng, Nian-Feng; Yuan, Xu; Chong, Frederic T; Peng, Lu (January 2024, IEEE Transactions on Quantum Engineering)

Full Text Available
Graph Neural Network Assisted Quantum Compilation for Qubit Allocation

https://doi.org/10.1145/3583781.3590300

LeCompte, Travis; Qi, Fang; Yuan, Xu; Tzeng, Nian-Feng; Najafi, M. Hassan; Peng, Lu (June 2023, Proceedings of 33rd Great Lakes Symposium on VLSI (GLSVLSI))

Quantum computers in the current noisy intermediate-scale quantum (NISQ) era face two major limitations: size and error vulnerability. Although quantum error correction (QEC) methods exist, they require thousands of qubits and so are not applicable to current machines, as NISQ systems have nearly one hundred qubits at most. One common approach to improving reliability is to adjust the compilation process to create a more reliable final circuit, where the two most critical compilation decisions are the qubit allocation and qubit routing problems. We focus on solving the qubit allocation problem and identifying initial layouts that reduce error. To identify these layouts, we combine reinforcement learning with a graph neural network (GNN)-based Q-network that processes the mesh topology of the quantum computer, known as the backend, and makes mapping decisions, creating a Graph Neural Network Assisted Quantum Compilation (GNAQC) strategy. We train the architecture using a set of four backends and six circuits and find that GNAQC improves output fidelity by roughly 12.7% over pre-existing allocation methods.
Full Text Available
Cascade Variational Auto-Encoder for Hierarchical Disentanglement

https://doi.org/10.1145/3511808.3557254

Lin, Fudong; Yuan, Xu; Peng, Lu; Tzeng, Nian-Feng (October 2022, Proceedings of the 31st ACM International Conference on Information & Knowledge Management)

Full Text Available
